Rank | Count | Beginning |
---|---|---|
1507 | 1988 | В |
18394 | 1807 | През |
11954 | 964 | На |
23079 | 817 | След |
15215 | 626 | От |
27177 | 534 | Той |
7145 | 464 | За |
16665 | 435 | По |
26555 | 427 | Това |
20328 | 334 | При |
24289 | 272 | Според |
28129 | 257 | Тя |
26029 | 234 | Те |
4566 | 231 | Въпреки |
22369 | 200 | С |
2 | 192 | “ |
8844 | 157 | Има |
9950 | 156 | Когато |
9590 | 151 | Като |
25687 | 138 | Така |
10586 | 126 | Към |
14835 | 118 | Освен |
13969 | 114 | Но |
25568 | 110 | Тази |
26072 | 106 | Тези |
27067 | 105 | Този |
133 | 103 | “, |
17141 | 102 | По-късно |
6016 | 96 | До |
25882 | 95 | Там |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
Especially for N=1, we only need a small corpus to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
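The query above can be mirrored outside the database. The following is a minimal sketch in Python, not the code used to produce the table: the function name `sentence_beginnings`, its parameters, and the three sample sentences are assumptions made for illustration. It splits on single spaces so that it behaves like `substring_index(sentence, ' ', n)`, and it takes `n` as a parameter so the same counting can be applied to N = 2, 3, 4.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent n-word sentence beginnings.

    Hypothetical helper: mirrors
    substring_index(sentence, ' ', n) ... group by ... order by cnt desc.
    """
    counts = Counter()
    for sentence in sentences:
        # Split on single spaces only, as substring_index does; a sentence
        # with fewer than n words is counted as a whole.
        tokens = sentence.split(" ")
        counts[" ".join(tokens[:n])] += 1
    return counts.most_common(limit)


# Invented sample sentences, not taken from the corpus.
sample = [
    "В града има музей.",
    "В края на годината той заминава.",
    "През 1908 г. е обявена независимостта.",
]

top_1 = sentence_beginnings(sample, n=1)
top_2 = sentence_beginnings(sample, n=2)
```

Like the query, this sketch does no tokenization beyond splitting on spaces, so punctuation attached to the first word (for example an opening quotation mark) is counted as part of the beginning.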